release: gastown-staging -> main by jrf0110 · Pull Request #3151 · Kilo-Org/cloud

jrf0110 · 2026-05-09T21:23:01Z

Summary

Promotes 5 commits from gastown-staging to main. Three independent fix groups plus a developer-facing test procedure:

Boot-hydration timeout fix — unblocks /agents/start during container boot hydration and preserves mayor tools on prewarm.
GitHub auth correctness — fresh integration tokens instead of stale stored value, plus distinguished failure messages when no token resolves.
Logging hygiene — redundant request-logging middleware replaced with per-route Hono-param tagging.
Developer tooling — dev-only convoy debug endpoints and a deterministic review-then-land E2E test procedure.

Also lowers TownContainerDO.max_instances from 800 → 500 (as part of commit 1).

Constituent commits

1. Boot hydration + mayor prewarm fix (`2ffcef28f`, direct push)

Three independent fixes for the startAgentInContainer timeout regression observed after #2974, plus a tighter container-instance cap.

Symptoms. Production logs were filling with two error patterns since the last gastown-staging→main promotion:

[<DOMAIN>] startAgentInContainer: EXCEPTION for agent <UUID>: TimeoutError: The operation was aborted due to timeout
timeout after 6000ms: ensureSDKServer for <agentId>

Root cause. The control server starts accepting requests immediately at boot (main.ts:83), while bootHydration() runs concurrently and serialises every registry agent + the new mayor prewarm through the global sdkServerLock (createKilo reads process.cwd()/process.env). Fresh /agents/start, /refresh-token, and PATCH /agents/:id/model requests queued behind that work and the DO-side AbortSignal.timeout(60s) (resp. REFRESH_AGENT_TIMEOUT_MS=6_000) fired before they ever got the lock.

The mayor prewarm added in #3122 made things worse on two axes:

It built KILO_CONFIG_CONTENT from hardcoded model defaults, so the real /agents/start with the user's actual model triggered ensureSDKServer's "config mismatch — evicting prewarmed server" path on every warm restart, doubling lock-holding time on the critical path the prewarm was supposed to speed up.
It was missing GASTOWN_AGENT_ROLE, GASTOWN_AGENT_ID, and GASTOWN_TOWN_ID from the prewarm env. kilo serve snapshots process.env at spawn, and plugin/index.ts:66 keys mayor-tool registration off GASTOWN_AGENT_ROLE === 'mayor'. Without those, the prewarmed server booted with no mayor tools, and the cache hit on the next /agents/start handed that defective instance back to the user — manifesting as "mayor tools became unavailable."

Changes

1. Hydration gate (`control-server.ts`, `process-manager.ts`)

New awaitHydration() exported from process-manager.ts: a promise that bootHydration replaces on entry and resolves in a finally. Awaited at the top of /agents/start, /refresh-token, and PATCH /agents/:id/model (before any process.env mutation in the model PATCH path so concurrent requests can't race on env writes before holding the SDK lock). Default-resolved at module init so test/dev contexts that never run hydration aren't blocked.
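
A minimal sketch of the gate described above, assuming a single module-level promise. Only `awaitHydration` and `bootHydration` are names from this PR; the `work` parameter and `handleAgentStart` are hypothetical stand-ins, and the real process-manager.ts may be structured differently.

```typescript
// Sketch of a boot-hydration gate (not the actual process-manager.ts code).

// Default-resolved so test/dev contexts that never hydrate are not blocked.
let hydration: Promise<void> = Promise.resolve();

/** Resolves once any in-flight boot hydration has finished. */
function awaitHydration(): Promise<void> {
  return hydration;
}

/** Replaces the gate on entry and releases it in a finally block. */
async function bootHydration(work: () => Promise<void>): Promise<void> {
  let release: () => void = () => {};
  hydration = new Promise<void>((resolve) => {
    release = resolve;
  });
  try {
    await work(); // hypothetical: registry agents + mayor prewarm
  } finally {
    release(); // waiters are released even if hydration throws
  }
}

// Hypothetical handler: wait for the gate before touching shared state.
async function handleAgentStart(log: string[]): Promise<void> {
  await awaitHydration();
  log.push("start");
}
```

Because the gate is awaited before any `process.env` write, a request arriving mid-boot waits for hydration to finish instead of queueing behind it on the SDK lock.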

2. Prewarm config matches `/agents/start` (`Town.do.ts`, `gastown.worker.ts`, `process-manager.ts`)

New getMayorPrewarmContext() on TownDO returns { agentId, model, smallModel, kilocodeToken, organizationId } resolved the same way _ensureMayor resolves them (config.resolveModel(townConfig, null, 'mayor')). The /api/towns/:townId/mayor-id endpoint now returns that whole context so the container builds a KILO_CONFIG_CONTENT byte-identical to what the next /agents/start will send. Falls back to the bare { agentId } shape for back-compat; the container skips prewarm when model/token aren't available rather than building a config that's guaranteed to mismatch.
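
The skip-versus-prewarm decision can be sketched roughly as follows. Field names follow the description above; `decidePrewarm` and its return shape are hypothetical, and the real container validates the response with a Zod schema rather than a hand-written check.

```typescript
// Sketch only: decidePrewarm is a hypothetical helper, not code from this PR.
interface MayorPrewarmResponse {
  agentId: string;
  model?: string;
  smallModel?: string;
  kilocodeToken?: string;
  organizationId?: string | null;
}

type PrewarmDecision =
  | { kind: "prewarm"; agentId: string; model: string; kilocodeToken: string }
  | { kind: "skip"; reason: string };

function decidePrewarm(res: MayorPrewarmResponse): PrewarmDecision {
  // A legacy worker returns only { agentId }. A config built from guessed
  // defaults would be evicted on the next /agents/start, so skip instead.
  if (!res.model || !res.kilocodeToken) {
    return { kind: "skip", reason: "model or token missing from mayor-id response" };
  }
  return {
    kind: "prewarm",
    agentId: res.agentId,
    model: res.model,
    kilocodeToken: res.kilocodeToken,
  };
}
```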

3. Mayor workdir + plugin env (`agent-runner.ts`, `process-manager.ts`)

Exported ensureMayorWorkspaceForTown(townId) so prewarmMayorSDK materialises the workspace before ensureSDKServer's process.chdir (was throwing ENOENT on cold containers).
buildPrewarmEnv now mirrors the mayor-shaped subset of buildAgentEnv: GASTOWN_AGENT_ID, GASTOWN_AGENT_ROLE='mayor', GASTOWN_TOWN_ID, KILOCODE_FEATURE='gastown', KILO_TEST_HOME, XDG_DATA_HOME. New end-to-end test intercepts createKilo and asserts those keys are visible to the spawn.
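
As an illustration of why the missing keys mattered, here is a hedged sketch of the env subset and the role check. The key names come from this PR; the function names, signatures, and values are hypothetical, and KILO_TEST_HOME / XDG_DATA_HOME are left out because their values are not shown here.

```typescript
// Hypothetical sketch; the real buildPrewarmEnv sets more keys than this.
function buildPrewarmEnvSketch(agentId: string, townId: string): Record<string, string> {
  return {
    GASTOWN_AGENT_ID: agentId,
    GASTOWN_AGENT_ROLE: "mayor",
    GASTOWN_TOWN_ID: townId,
    KILOCODE_FEATURE: "gastown",
  };
}

// Mirrors the check this PR attributes to plugin/index.ts: mayor tools are
// registered only when the role seen at spawn time is 'mayor'.
function shouldRegisterMayorTools(env: Record<string, string | undefined>): boolean {
  return env["GASTOWN_AGENT_ROLE"] === "mayor";
}
```

Since the environment is snapshotted at spawn, a server prewarmed without the role key would fail this check for its whole lifetime, which matches the "mayor tools became unavailable" symptom.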

4. `wrangler.jsonc`

Lowered TownContainerDO.max_instances from 800 → 500 (manual change).

2. Remove manual request logging middleware (#3158, `a6cf1029b`)

Removes the redundant request-logging middleware in gastown.worker.ts that logged every request twice (-->/<-- via logger.info) — already covered by the per-route instrumented(c, route, handler) AE event wrapper. Replaces the regex-based logger.setTags block with proper per-route tagging using Hono c.req.param() matching for :orgId / :townId / :rigId / :agentId prefixes. Net diff: ~30 deletions + ~25 additions.
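
The param-based tagging can be sketched as a pure function over the params record. In a Hono middleware the input would be the object returned by `c.req.param()`; `tagsFromParams` and `TAGGED_PARAMS` are hypothetical names, and the actual middleware in gastown.worker.ts may be organised differently.

```typescript
// Sketch: derive structured log tags from route params instead of regexes.
const TAGGED_PARAMS = ["orgId", "townId", "rigId", "agentId"] as const;

type TagKey = (typeof TAGGED_PARAMS)[number];
type Tags = Partial<Record<TagKey, string>>;

function tagsFromParams(params: Record<string, string | undefined>): Tags {
  const tags: Tags = {};
  for (const key of TAGGED_PARAMS) {
    const value = params[key];
    if (value) tags[key] = value; // only tag params the matched route defines
  }
  return tags;
}
```

For example, a request matched by a route containing `:townId` and `:rigId` yields `{ townId, rigId }` and nothing else.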

Link: #3158

3. Convoy debug endpoints + E2E test procedure (`7f9121ffa`, direct push)

Adds three dev-only debug endpoints for autonomous convoy testing without going through the mayor LLM:

GET /debug/towns/:townId/rigs — list rigs in a town
POST /debug/towns/:townId/sling-convoy — call Town.slingConvoy() directly
GET /debug/towns/:townId/convoys — list active convoys with progress

Documents the new endpoints and adds a Test C section to services/gastown/docs/e2e-pr-feedback-testing.md with a deterministic procedure for verifying review-then-land convoys end-to-end (sub-bead PRs into the convoy feature branch, then a landing PR into main). Captures known issues observed during verification: container MTU/TLS handshake failures with github.com, 'failed' blockers not gating dependents, and intermittent polecat skipping of sub-PR creation.

4. Fresh integration tokens for GitHub auth (`ce15a6fe7`, direct push)

resolveGitHubToken previously preferred git_auth.github_token over the platform integration. Since GitHub App installation tokens have a 1h TTL but git_auth.github_token is only written at rig creation (or rare manual refresh), every long-lived town with an integration was handing out an expired token to:

Polecat/refinery gh CLI (via GH_TOKEN derived from GIT_TOKEN in the container), surfacing as "Failed to log in to github.com using token (GH_TOKEN). The token in GH_TOKEN is invalid."
The worker-side PR poller (checkPRStatus, checkPRFeedback, mergePR, areThreadsBlocking) — 401 from api.github.com.
The /refresh-git-token endpoint the container falls back to on auth failure — it returned the same expired token, so the retry just re-failed.

Fix flips priority to github_cli_pat → live integration → stored github_token (last-resort fallback for towns with no integration). Empty-string responses from the integration service now warn and fall back instead of silently failing. Resolves a fresh token at agent dispatch (startAgentInContainer), merge dispatch (startMergeInContainer), and rig setup (setupRigRepoInContainer) before stuffing GIT_TOKEN into envVars. buildContainerConfig now resolves a fresh token before serializing git_auth.github_token into the X-Town-Config header. Adds 6 unit tests covering the priority chain.
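
The new priority chain can be sketched as below. This is an illustration only: `resolveTokenSketch`, `TokenInputs`, and the `fetchIntegrationToken` callback are hypothetical stand-ins for the real resolveGitHubToken and the integration service call.

```typescript
// Sketch of the priority chain: cli_pat -> live integration -> stored token.
type TokenSource = "github_cli_pat" | "integration" | "stored_github_token";

interface TokenInputs {
  githubCliPat?: string;
  // Hypothetical stand-in for the live integration lookup.
  fetchIntegrationToken?: () => Promise<string | null>;
  storedGithubToken?: string;
}

async function resolveTokenSketch(
  inputs: TokenInputs,
): Promise<{ token: string; source: TokenSource } | null> {
  if (inputs.githubCliPat) {
    return { token: inputs.githubCliPat, source: "github_cli_pat" };
  }
  if (inputs.fetchIntegrationToken) {
    try {
      const fresh = await inputs.fetchIntegrationToken();
      if (fresh) return { token: fresh, source: "integration" };
      // Empty or null response: warn and fall through rather than fail silently.
      console.warn("integration returned an empty token; falling back");
    } catch (err) {
      console.warn("integration lookup failed; falling back", err);
    }
  }
  if (inputs.storedGithubToken) {
    // Last resort: may be expired for towns that do have an integration.
    return { token: inputs.storedGithubToken, source: "stored_github_token" };
  }
  return null;
}
```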

5. Distinguish null causes in PR status polling (#3160, `63873e425`)

Fixes #3149.

Replace PRStatusResult | null return type with discriminated PRStatusOutcome union in checkPRStatus. Each null cause (no token, HTTP error, invalid response, unrecognized URL, host mismatch) now surfaces a structured PRStatusError with actionable failure messages.

Key changes:

resolveGitHubToken returns GitHubTokenResolution with resolution chain tracking which sources were tried (back-compat helper resolveGitHubTokenString exists for non-error-aware callers).
no_token, unrecognized_url, host_mismatch, and non-transient HTTP errors (401/403/404) fail the bead immediately (1 strike).
invalid_response fails after 3 strikes.
Transient HTTP errors (5xx/429) keep existing 10-strike behavior.
poll_transient_count and poll_non_transient_count are now separate counters (replacing the cross-contaminated single poll_null_count); both reset on a successful poll.
failureKind persisted to bead metadata for analytics.
AE event pr.poll_failed emitted on terminal failure.
resolveGitHubToken tracks the configured integration source even when GIT_TOKEN_SERVICE binding is missing.
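
A rough sketch of the error kinds and strike thresholds follows. It uses the bucket assignment from the follow-up review commit in #3160, where unrecognized_url and host_mismatch fail immediately and only invalid_response keeps the 3-strike rule. The type shape and `strikesToFailSketch` are hypothetical, and treating every HTTP status other than 5xx/429 as non-transient is an assumption.

```typescript
// Sketch only: not the actual PRStatusError or threshold helpers.
type PollErrorSketch =
  | { kind: "no_token" }
  | { kind: "http_error"; status: number }
  | { kind: "invalid_response" }
  | { kind: "unrecognized_url" }
  | { kind: "host_mismatch" };

function isTransientSketch(err: PollErrorSketch): boolean {
  // Assumption: only 5xx and 429 are worth retrying many times.
  return err.kind === "http_error" && (err.status >= 500 || err.status === 429);
}

/** Number of consecutive failures of this kind before the bead is failed. */
function strikesToFailSketch(err: PollErrorSketch): number {
  switch (err.kind) {
    case "no_token":
    case "unrecognized_url":
    case "host_mismatch":
      return 1; // deterministic configuration errors cannot self-resolve
    case "http_error":
      return isTransientSketch(err) ? 10 : 1;
    case "invalid_response":
      return 3;
  }
}
```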

Link: #3160

Verification

Unit-tested the hydration gate end-to-end with a fetch barrier (asserts awaiters block while bootHydration is in flight, release when it returns).
Unit-tested the prewarm env shape end-to-end (drives bootHydration with a /mayor-id fetch mock, intercepts createKilo, asserts GASTOWN_AGENT_ID, GASTOWN_AGENT_ROLE='mayor', GASTOWN_TOWN_ID, GASTOWN_CONTAINER_TOKEN, and a non-empty KILO_CONFIG_CONTENT are all visible at spawn time).
Reviewed the _ensureMayor model-resolution path to confirm resolveModel(townConfig, null, 'mayor') is byte-identical to what /agents/start will send (mayor role ignores rigOverride entirely in config.resolveModel).
Manual production verification deferred — these changes target a hot path that's hard to reproduce locally; will monitor Sentry / AE mayor.ensure_decision: short_circuit_warm and agent.startup_phase after merge.
pnpm --filter cloudflare-gastown typecheck passes.
Unit tests for PR polling: test/unit/pr-poll-errors.test.ts (checkPRStatus, resolveGitHubToken), test/unit/pr-poll-thresholds.test.ts (failureMessageFor, shouldFailImmediately, shouldCountAsTransient).
Integration test for no_token immediate-fail path: test/integration/pr-poll-errors.test.ts.
HTTP error scenarios covered by unit tests (mocking fetch is not practical in Cloudflare Workers integration test runtime).

Visual Changes

N/A

Reviewer Notes

The /api/towns/:townId/mayor-id response shape is back-compat: the container's Zod schema (MayorPrewarmResponse) accepts both the new full-context shape and the legacy { agentId } shape with .passthrough(), and falls back to "skip prewarm" on missing fields.
The organizationId fallback chain in buildPrewarmEnv distinguishes undefined (older worker, fall back to process.env) from null (worker authoritatively says "no org") so a stale env-var value can't override an authoritative null.
The hydration gate is a single global promise — bootHydration is currently single-call from main.ts. If we ever add periodic re-hydration, the resolver capture should move to a local inside bootHydration (called out in code review as a SUGGESTION, deferred).
Two SUGGESTION-level findings deferred from code review: (a) prewarmMayorSDK warns but doesn't bail on workdir-mismatch (cheap to harden later), (b) one negative-case timing assertion in the new test relies on a 10ms setTimeout (test still validates the positive case deterministically).
The refresh-git-token.handler.ts change is a caller update for the new GitHubTokenResolution return type (was string | null).
The wrangler.jsonc max_instances change (800→500) is from the boot hydration commit (2ffcef28f).
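
The undefined-versus-null distinction in the organizationId note above can be sketched like this. `resolveOrganizationIdSketch` and the `GASTOWN_ORGANIZATION_ID` variable name are hypothetical; the PR does not name the env var that buildPrewarmEnv falls back to.

```typescript
// Sketch: undefined means "older worker did not send the field";
// null means "worker authoritatively says there is no org".
function resolveOrganizationIdSketch(
  fromWorker: string | null | undefined,
  env: Record<string, string | undefined>,
): string | null {
  if (fromWorker === undefined) {
    // Hypothetical env var name, used only for illustration.
    return env["GASTOWN_ORGANIZATION_ID"] ?? null;
  }
  // A string or an authoritative null both win over any stale env value.
  return fromWorker;
}
```

A plain `fromWorker ?? env[...]` would not work here, because `??` treats null and undefined alike and would let a stale env value override the authoritative null.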

…ayor tools on prewarm

Three independent fixes for the startAgentInContainer timeout regression introduced by #2974, plus a tighter container-instance cap.

1. Hydration gate (control-server.ts, process-manager.ts)

The control server starts accepting requests immediately at boot, while bootHydration runs concurrently and serialises every registry agent + the mayor prewarm through the global sdkServerLock. Fresh /agents/start, /refresh-token, and PATCH /agents/:id/model requests queued behind that work and the DO-side AbortSignal.timeout(60s) fired before they ever got the lock, surfacing as "TimeoutError: aborted due to timeout" and "timeout after 6000ms: ensureSDKServer for <agentId>". A new awaitHydration() promise is awaited at the top of those handlers (before any process.env mutation in the model PATCH path) so they don't compound the queue.

2. Prewarm config matches /agents/start (Town.do.ts, gastown.worker.ts, process-manager.ts)

buildPrewarmEnv was constructing KILO_CONFIG_CONTENT from hardcoded defaults (anthropic/claude-sonnet-4.6 / claude-haiku-4.5), so the real /agents/start with the user's actual model triggered ensureSDKServer's "config mismatch, evicting prewarmed server" path on every warm restart, doubling lock-holding time on the critical path the prewarm was supposed to speed up. The /api/towns/:id/mayor-id endpoint now returns the full prewarm context (model, smallModel, kilocodeToken, organizationId) resolved the same way _ensureMayor resolves it, and the container builds the prewarm KILO_CONFIG_CONTENT to match. Falls back gracefully to a skip when the worker hasn't deployed the richer endpoint yet.

3. Mayor workdir + plugin env (agent-runner.ts, process-manager.ts)

prewarmMayorSDK called mayorWorkdirForTown (which only returns a string) and went straight to ensureSDKServer's process.chdir, throwing ENOENT on cold containers because createMayorWorkspace only ran from runAgent. Exported ensureMayorWorkspaceForTown so prewarm materialises the workspace first. More critically, buildPrewarmEnv was missing GASTOWN_AGENT_ROLE, GASTOWN_AGENT_ID, and GASTOWN_TOWN_ID: env vars the kilo serve plugin (plugin/index.ts) reads at spawn to decide whether to register mayor tools. Without them the prewarmed server booted with NO mayor tools, and the cache hit on the next /agents/start handed that defective instance back to the user. Now mirrors the mayor-shaped subset of buildAgentEnv. Added an end-to-end test that intercepts createKilo and asserts the env at spawn time.

4. wrangler.jsonc: lower TownContainerDO max_instances from 800 to 500.

Verified with pnpm --filter gastown-container test (67/67 pass), pnpm --filter cloudflare-gastown typecheck, oxlint, and pnpm format.

kilo-code-bot · 2026-05-09T21:26:11Z

Code Review Summary

Status: No Issues Found | Recommendation: Merge

✅ All Previously Flagged Issues Resolved

| File | Issue | Status |
| --- | --- | --- |
| `services/gastown/container/src/control-server.ts` | `process.env.GASTOWN_CONTAINER_TOKEN` mutated before `awaitHydration()` | ✅ Fixed |
| `services/gastown/container/src/process-manager.ts` | `_resolveHydration` module-global stale-capture could orphan resolver | ✅ Fixed |
| `services/gastown/src/gastown.worker.ts` | Double RPC call to TownDO in mayor-id endpoint | ✅ Fixed |
| `services/gastown/src/gastown.worker.ts` | `rigId` not tagged for `/api/users/:userId/rigs/:rigId` routes | ✅ Fixed |
| `services/gastown/src/dos/town/actions.ts` | Misleading migration comment in `poll_non_transient_count` branch | ✅ Fixed |

Files Reviewed (all commits)

services/gastown/container/src/agent-runner.ts
services/gastown/container/src/control-server.ts
services/gastown/container/src/process-manager.test.ts
services/gastown/container/src/process-manager.ts
services/gastown/docs/e2e-pr-feedback-testing.md
services/gastown/src/dos/Town.do.ts
services/gastown/src/dos/town/actions.ts
services/gastown/src/dos/town/config.ts
services/gastown/src/dos/town/container-dispatch.ts
services/gastown/src/dos/town/town-scm.ts
services/gastown/src/gastown.worker.ts
services/gastown/src/handlers/refresh-git-token.handler.ts
services/gastown/test/integration/pr-poll-errors.test.ts
services/gastown/test/unit/pr-poll-errors.test.ts
services/gastown/test/unit/pr-poll-thresholds.test.ts
services/gastown/test/unit/town-scm.test.ts
services/gastown/wrangler.jsonc

Reviewed by claude-sonnet-4.6 · 532,046 tokens

* chore(gastown): remove manual request logging middleware
* fix(gastown): unblock /agents/start during boot hydration; preserve mayor tools on prewarm
* feat(gastown): per-route logger tagging via Hono params (review on #3158)

Co-authored-by: John Fawcett <john@kilcoode.ai>

…st procedure

Adds three dev-only debug endpoints for autonomous convoy testing without going through the mayor LLM:

- GET /debug/towns/:townId/rigs: list rigs in a town
- POST /debug/towns/:townId/sling-convoy: call Town.slingConvoy() directly
- GET /debug/towns/:townId/convoys: list active convoys with progress

Documents the new endpoints and adds a Test C section to e2e-pr-feedback-testing.md with a deterministic procedure for verifying review-then-land convoys end-to-end (sub-bead PRs into the convoy feature branch, then a landing PR into main). Also captures known issues observed during verification: container MTU/TLS handshake failures with github.com, 'failed' blockers not gating dependents, and intermittent polecat skipping of sub-PR creation.

… stale stored value

resolveGitHubToken previously preferred git_auth.github_token over the platform integration. Since GitHub App installation tokens have a 1h TTL but git_auth.github_token is only written at rig creation (or rare manual refresh), every long-lived town with an integration was handing out an expired token to:

- Polecat/refinery 'gh' CLI (via GH_TOKEN derived from GIT_TOKEN in the container), surfacing as 'Failed to log in to github.com using token (GH_TOKEN). The token in GH_TOKEN is invalid.'
- The worker-side PR poller (checkPRStatus, checkPRFeedback, mergePR, areThreadsBlocking): 401 from api.github.com.
- The /refresh-git-token endpoint the container falls back to on auth failure: it returned the same expired token, so the retry just re-failed.

Verified by hitting api.github.com with a local town's stored token: 401 even though the integration service mints fresh ones fine.

Fix:

- Flip resolveGitHubToken's priority to github_cli_pat -> live integration -> stored github_token (last-resort fallback for towns with no integration). Empty-string responses from the integration service now warn and fall back instead of silently failing.
- Resolve a fresh token at agent dispatch (startAgentInContainer), merge dispatch (startMergeInContainer), and rig setup (setupRigRepoInContainer) before stuffing GIT_TOKEN into envVars.
- buildContainerConfig now resolves a fresh token before serializing git_auth.github_token into the X-Town-Config header. The container's syncTownConfigToProcessEnv path reads this on every request to update process.env.GIT_TOKEN, which buildLiveHotSwapEnv then derives GH_TOKEN from on token-refresh hot-swaps. townId is required (not optional) so a forgotten arg can't silently regress to the stale-token shape.
- syncConfigToContainer resolves a fresh token before persisting GIT_TOKEN to DO storage for next boot.

Adds 6 unit tests covering the priority chain (cli_pat preferred, fresh integration over stale stored, fallback on lookup failure, rig-level integration ID, no-config returns null).

…3160)

* fix(gastown): distinguish null causes in PR status polling (#3149)

  Replace PRStatusResult | null return type with discriminated PRStatusOutcome union in checkPRStatus. Each null cause (no token, HTTP error, invalid response, unrecognized URL, host mismatch) now surfaces a structured PRStatusError with actionable failure messages.

  - resolveGitHubToken returns GitHubTokenResolution with resolution chain
  - no_token and non-transient HTTP errors (401/403/404) fail immediately
  - invalid_response/unrecognized_url/host_mismatch fail after 3 strikes
  - Transient HTTP errors (5xx/429) keep existing 10-strike behavior
  - poll_null_count resets to 0 on successful poll at both call sites
  - failureKind persisted to bead metadata for analytics
  - AE event pr.poll_failed emitted on terminal failure
  - Unit tests for checkPRStatus, resolveGitHubToken, failureMessageFor, and threshold logic
  - Integration test for no_token immediate-fail path

* style: apply oxfmt formatting

* fix(gastown): track integration source when GIT_TOKEN_SERVICE unbound (review on town-scm.ts:66)

  When integrationId is set but GIT_TOKEN_SERVICE binding is missing, the configured integration source was silently omitted from the tried array. Add an else branch that pushes the source label with a '(GIT_TOKEN_SERVICE not bound)' annotation so the no_token error message lists all attempted sources.

* fix(gastown): fail immediately for unrecognized_url and host_mismatch (review on actions.ts:374)

  Both are deterministic configuration errors that cannot self-resolve on retry. Move them from the 3-strike bucket to the fail-immediately bucket alongside no_token and non-transient http_error. Only invalid_response remains in the 3-strike category.

* fix(gastown): use separate counters for transient vs non-transient poll errors (review on actions.ts:1350)

  Replace the shared poll_null_count with poll_transient_count and poll_non_transient_count. Each error category increments only its own counter and resets the other, preventing cross-contamination where 9 transient errors followed by 1 non-transient error would incorrectly fail the bead.

  Legacy poll_null_count is migrated on first read: the transient branch falls back to poll_null_count when poll_transient_count is absent. This ensures in-flight beads at deploy time retain their existing counter value. The non-transient branch does not read the legacy field since these counters reset on every success anyway; at worst an in-flight bead gets one extra retry for invalid_response.

* fix(gastown): resolve merge conflict in resolveGitHubToken - merge staging priority with PR #3160 structured return type

  - resolveGitHubToken now uses staging's priority: cli_pat → integration → stored token
  - Returns GitHubTokenResolution discriminated union (from PR #3160)
  - Includes unbound-service else branch (GIT_TOKEN_SERVICE not bound)
  - Adds resolveGitHubTokenString helper for non-error-aware callers
  - Updates Town.do.ts, container-dispatch.ts, config.ts to use helper
  - Updates town-scm.test.ts for GitHubTokenResolution return shape
  - Updates pr-poll-errors.test.ts for new priority order

Co-authored-by: John Fawcett <john@kilcoode.ai>
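
The separate-counter behaviour described in the commit above, including the legacy poll_null_count fallback, can be sketched in plain TypeScript. The field names come from the commit message; `recordPollError` and `recordPollSuccess` are hypothetical, and the real implementation updates bead metadata through SQL in actions.ts.

```typescript
// Sketch only: in-memory model of the two poll counters.
interface PollCounters {
  poll_transient_count?: number;
  poll_non_transient_count?: number;
  poll_null_count?: number; // legacy shared counter, read once for migration
}

function recordPollError(meta: PollCounters, transient: boolean): PollCounters {
  if (transient) {
    // Migrate the legacy value when the new transient counter is absent.
    const prev = meta.poll_transient_count ?? meta.poll_null_count ?? 0;
    return { poll_transient_count: prev + 1, poll_non_transient_count: 0 };
  }
  // The non-transient branch deliberately ignores the legacy field.
  const prev = meta.poll_non_transient_count ?? 0;
  return { poll_transient_count: 0, poll_non_transient_count: prev + 1 };
}

function recordPollSuccess(): PollCounters {
  return { poll_transient_count: 0, poll_non_transient_count: 0 };
}
```

With this shape, nine transient errors followed by one non-transient error leave the non-transient counter at 1 rather than 10, which is the cross-contamination the commit set out to remove.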

… awaitHydration in /refresh-token

The /refresh-token handler assigned process.env.GASTOWN_CONTAINER_TOKEN before awaiting hydration, inconsistent with PATCH /agents/:id/model which gates first. Mid-hydration token refresh could cause buildPrewarmEnv to pick up a different token than the one hydration captured locally.

…stead of module global

The _resolveHydration module-global stale-capture pattern would orphan the first promise's resolver if bootHydration() were ever called concurrently. Capturing resolve as a local inside bootHydration() itself eliminates the risk and removes the module-global.

… getMayorPrewarmContext

getMayorPrewarmContext now returns { agentId } even when the kilocode token is unavailable (instead of null), so the worker route no longer needs to fall through to getMayorAgentId. This eliminates the redundant agents.listAgents SQL query over a second RPC hop.

…igId routes

The per-route tagging middleware registered prefixes under /api/orgs/:orgId/... but missed the parallel /api/users/:userId/rigs/:rigId family. Without this, requests to those routes lack rigId in structured log tags.

…oll counter

jrf0110 · 2026-05-11T16:43:56Z

Review observation dispositions

Observation A — "Request/response logging removed without replacement"

Intentional — PR #3158 deletes those manual log lines because instrumented() already emits structured AE events with route, userId, townId, rigId, agentId, beadId, durationMs, and error per route. The new per-route Hono-param tagging middleware preserves the tagging side of the old block. No tracing observability is lost; structured tracing replaces it.

Observation C — "Double `GIT_TOKEN_SERVICE.getToken` call per agent start"

Acknowledged. The second GIT_TOKEN_SERVICE.getToken is a KV cache hit, so the perf impact is negligible, but the duplication is real — will track as a separate cleanup since deduping requires changing buildContainerConfig's signature beyond what's in scope for this release.

Additional thread resolved

A 4th inline thread about a misleading migration comment in actions.ts was also addressed: the comment on the non-transient poll counter branch claimed poll_null_count migration, but the SQL doesn't include it (correctly, since invalid_response is a new error kind). Fixed in c65fbc2.

kilo-code-bot Bot reviewed May 9, 2026

jrf0110 and others added 4 commits May 10, 2026 18:00

jrf0110 changed the title ~~fix(gastown): unblock /agents/start during boot hydration; preserve mayor tools on prewarm~~ release: gastown-staging -> main May 11, 2026

John Fawcett added 4 commits May 11, 2026 16:21

kilo-code-bot Bot reviewed May 11, 2026

fix(gastown): correct misleading migration comment in non-transient p…

c65fbc2

…oll counter

chore(gastown): apply oxfmt formatting

6022467

jrf0110 wants to merge 11 commits into main from gastown-staging
